
    Constrained Nonnegative Matrix Factorization with Applications to Music Transcription

    In this work we explore using nonnegative matrix factorization (NMF) for music transcription, as well as several other applications. NMF is an unsupervised learning method capable of finding a parts-based additive model of data. Since music is additive (each time point in a musical piece is a sum of notes), NMF is a natural fit for its analysis. NMF exploits this additivity to factor an audio sample into both the individual note spectra and their transcription. To improve the performance of NMF we apply different constraints to the model: sparsity, and piecewise smoothness with aligned breakpoints. We demonstrate our method on real music data and report promising results that exceed the current state of the art. We also consider other applications, such as instrument and speaker separation and handwritten character analysis.
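    As a rough illustration of the underlying factorization (not the constrained variant described above: the sparsity and smoothness penalties are omitted, and all names and sizes here are illustrative), a plain Lee-Seung NMF of a magnitude spectrogram might look like:

        import numpy as np

        def nmf(V, rank, n_iter=200, eps=1e-9):
            # Factor a nonnegative spectrogram V (freq x time) as W @ H:
            # W's columns are note spectra, H's rows their activations in
            # time (a rough piano-roll transcription).
            rng = np.random.default_rng(0)
            W = rng.random((V.shape[0], rank)) + eps
            H = rng.random((rank, V.shape[1])) + eps
            for _ in range(n_iter):
                # Lee-Seung multiplicative updates for the Frobenius loss;
                # they keep W and H nonnegative by construction.
                H *= (W.T @ V) / (W.T @ W @ H + eps)
                W *= (V @ H.T) / (W @ H @ H.T + eps)
            return W, H

        V = np.abs(np.random.randn(513, 400))  # stand-in magnitude spectrogram
        W, H = nmf(V, rank=8)                  # assume 8 distinct notes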

    Learning Sparse Orthogonal Wavelet Filters

    The wavelet transform is a well-studied and well-understood analysis technique in signal processing. In wavelet analysis, signals are represented as a sum of self-similar wavelet and scaling functions. Typically, the wavelet transform uses a fixed set of analytically derived wavelet functions. We propose a method for learning wavelet functions directly from data. We impose an orthogonality constraint on the functions so that the learned wavelets can be used to perform both analysis and synthesis. We accomplish this using gradient descent, leveraging existing automatic differentiation frameworks. Our learned wavelets capture the structure of the data by exploiting sparsity, and we show that they have structure similar to traditional wavelets.

    Machine learning has proven to be a powerful tool in signal processing and computer vision. Recently, neural networks have become a popular and successful method for solving a variety of tasks. However, much of this success is not well understood, and neural network models are often treated as black boxes. This thesis provides insight into the structure of neural networks. In particular, we consider the connection between convolutional neural networks and multiresolution analysis, and show that the wavelet transform shares similarities with current convolutional neural network architectures. We hope that viewing neural networks through the lens of multiresolution analysis may provide some useful insights.

    We begin the thesis by motivating our method for one-dimensional signals and then show that the framework extends easily to multidimensional signals. Our learning method is evaluated on a variety of supervised and unsupervised tasks, such as image compression and audio classification. The tasks are chosen to compare the usefulness of the learned wavelets to traditional wavelets, as well as to provide a comparison to existing neural network architectures.

    The wavelet transform used in this thesis has some drawbacks and limitations, caused in part by our use of separable real filters. We address these shortcomings by exploring an extension known as the dual-tree complex wavelet transform. Our wavelet learning model extends into the dual-tree domain with few modifications, overcoming the limitations of the standard model. With this new model we show that localized, oriented filters arise from natural images.
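    A minimal sketch of the core idea, learning a scaling filter by gradient descent with a soft orthogonality penalty and a sparsity objective, might look like the following. It assumes PyTorch for automatic differentiation, the standard alternating-flip construction of the wavelet filter, and illustrative loss weights; it is not the thesis's exact formulation.

        import torch

        L = 8                                   # filter length (illustrative)
        h = torch.randn(L, requires_grad=True)  # learnable scaling (lowpass) filter

        def wavelet_from_scaling(h):
            # Alternating flip: the standard highpass companion of an
            # orthogonal lowpass filter.
            signs = torch.ones_like(h)
            signs[1::2] = -1.0
            return signs * h.flip(0)

        def ortho_penalty(h):
            # Soft orthonormality: <h, h shifted by 2m> = delta(m), plus
            # sum(h) = sqrt(2) for a valid scaling filter.
            p = (h @ h - 1.0) ** 2 + (h.sum() - 2.0 ** 0.5) ** 2
            for m in range(1, L // 2):
                p = p + (h[2 * m:] @ h[:L - 2 * m]) ** 2
            return p

        def analysis(x, h):
            # One DWT level as strided convolution (boundary handling omitted).
            g = wavelet_from_scaling(h)
            xb = x.view(1, 1, -1)
            lo = torch.conv1d(xb, h.view(1, 1, -1), stride=2)
            hi = torch.conv1d(xb, g.view(1, 1, -1), stride=2)
            return lo, hi

        x = torch.sin(torch.linspace(0, 20, 256))  # stand-in training signal
        opt = torch.optim.Adam([h], lr=1e-2)
        for _ in range(2000):
            opt.zero_grad()
            lo, hi = analysis(x, h)
            # Sparsity of detail coefficients plus the orthogonality penalty.
            loss = hi.abs().mean() + 10.0 * ortho_penalty(h)
            loss.backward()
            opt.step()

    Enforcing orthogonality as a penalty rather than a hard constraint keeps the optimization unconstrained, at the cost of only approximately satisfying perfect reconstruction.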

    Learning Filters for the 2D Wavelet Transform

    © 2018 IEEE. We propose a new method for learning filters for the 2D discrete wavelet transform, extending our previous work on the 1D wavelet transform to images. We show that the 2D wavelet transform can be represented as a modified convolutional neural network (CNN), which allows us to learn wavelet filters from data by gradient descent. Our learned wavelets are similar to traditional wavelets, which are typically derived using Fourier methods. For filter comparison, we use a cosine measure taken over all filter rotations. The learned wavelets capture the structure of the training data, and we can generate images from the model in order to evaluate the filters. The main finding of this work is that wavelet functions can arise naturally from data, without the need for Fourier methods. Our model requires relatively few parameters compared to traditional CNNs and is easily incorporated into neural network frameworks.

    Natural Sciences and Engineering Research Council of Canada
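    As a sketch of the filter-comparison step, assuming "rotation" means a circular shift of the filter taps (an assumption; the paper defines the exact measure), the rotation-invariant cosine score could be computed as:

        import numpy as np

        def cosine_under_rotations(f, g):
            # Max cosine similarity between f and every circular shift of g,
            # so filters that agree up to a shift score near 1.
            f = f / np.linalg.norm(f)
            g = g / np.linalg.norm(g)
            return max(float(f @ np.roll(g, s)) for s in range(len(g)))

        haar = np.array([1.0, 1.0]) / np.sqrt(2.0)    # Haar scaling filter
        learned = np.array([0.72, 0.69])              # hypothetical learned filter
        print(cosine_under_rotations(learned, haar))  # close to 1.0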